Amino acid sequence of human histidine-rich glycoprotein derived
from the nucleotide sequence of its cDNA. 

Koide T, Foster D, Yoshitake S, Davie EW. 

A lambda gt11 library containing cDNA inserts prepared from human liver
mRNA has been screened with an affinity-purified antibody to human 
histidine-rich glycoprotein (HRG) and then with a restriction fragment isolated 
from the 5' end of the largest cDNA insert obtained by antibody screening. A 
number of positive clones were identified and shown to code for HRG by DNA 
sequence analysis. A total of 2067 nucleotides were determined by sequencing 3 
overlapping cDNA clones, which included 121 nucleotides of 5'-noncoding
sequence, 54 nucleotides coding for a leader sequence of 18 amino acids, 1521 
nucleotides coding for the mature protein of 507 amino acids, a stop codon of 
TAA, and 352 nucleotides of 3'-noncoding sequence followed by a poly(A) tail of
16 nucleotides. The length of the noncoding sequence of the 3' end differed in 
several clones, but each contained a polyadenylylation or processing sequence of 
AATAAA followed by a poly(A) tail. More than half of the amino acid sequence 
of HRG consisted of five different types of internal repeats. Within the last 3 
internal repeats (type V), there were 12 tandem repetitions of a 5 amino acid
segment with a consensus sequence of Gly-His-His-Pro-His. This repeated 
portion, referred to as a "histidine-rich region", contained 53% histidine and 
showed a high degree of similarity to a histidine-rich region of high molecular 
weight kininogen. 
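
The segment lengths reported above can be checked with simple arithmetic. The following Python sketch tallies them and confirms that the two coding lengths correspond to the stated numbers of amino acids. The dictionary keys are illustrative labels chosen here, not names used by the authors.

```python
# Segment lengths in nucleotides, as reported in the abstract.
segments = {
    "5' noncoding": 121,
    "leader peptide": 54,
    "mature protein": 1521,
    "stop codon (TAA)": 3,
    "3' noncoding": 352,
    "poly(A) tail": 16,
}

# The segments should add up to the total number of nucleotides determined.
total = sum(segments.values())

# Three nucleotides per codon give the number of encoded residues.
leader_residues = segments["leader peptide"] // 3
mature_residues = segments["mature protein"] // 3

print(total)            # 2067
print(leader_residues)  # 18
print(mature_residues)  # 507
```

The tally matches the reported total of 2067 nucleotides only when the 3-nucleotide stop codon and the 16-nucleotide poly(A) tail are both counted.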
Histidine-rich glycoprotein (HRG) belongs to the cystatin
superfamily (7) and appears to be a potential risk factor for
thrombosis. An increased prevalence of elevated HRG plasma
levels in patients with venous thrombosis and families with
thrombophilia has been reported (1). It is interesting to note
that the genes of four different members of the cystatin
superfamily are located on the distal section of the long arm of
chromosome 3: Stefin A (STF1) on 3q21, Kininogen (KNG) on
3q26-qter, α-2-HS-glycoprotein (AHSG) on 3q27-q28, and
HRG on 3q21-qter. To further investigate the evolutionary
relationship between HRG and members of the cystatin superfamily,
we isolated a cosmid that was used to refine the chromosomal
localization of HRG by in situ hybridization. In addition, we used
a dinucleotide repeat polymorphism (2) to localize HRG on the
linkage map of chromosome 3q.

1 To whom correspondence should be addressed at the Gaubius
Laboratory IVVO-TNO, P.O. Box 430, 2300 AK Leiden, The
Netherlands. Telephone: +31 71 181509. Fax: +31 71 181904.

Exon VII          Exon VIII/IX

5' ttcctttgtagG
CTATGATGTAQAAGCCTTGGACTTGGAAAGCCCGAAA
AACCTTGTCATAAACTGTGAAGTCTTCGACCCTCAGgt
gngttgtct aagcagactttgtcatggcagtgc aqattaagtgacatacg
tacacaaatagtgttgttgcttcctaaagctctatgagtgg^/^/^/
gtgtgtgagagagagagagagagagagagagagacagagacagagac
^^^^^agagggagacagggagagagagaaagagagagaca
gacaqacacgcaagaaaaaagatacagagagtatc ccacaactgggg
aaaggagtgcaa 3'

FIG. 1. Part of the HRG gene structure present in cosmid
c2RBHRG-8. The intron H between exons VIII and IX proposed by
Koide et al. (5) is indicated by a vertical arrow. Primers used to
amplify the predicted intron H are indicated by horizontal arrows
below exon VIII/IX. The compound repeat is indicated by a line in
the dashed box. The enlargement of the dashed box shows the
sequence of the repeat (GenBank Accession No. Z17218) in detail. A
sequence identical to exon VII (6) is shown in capital letters and
the microsatellite repeat sequence is shown in italics.
Oligonucleotide primers used to amplify the repeat are underlined.

Using a cDNA (6) probe for HRG, a 45-kb cosmid
(c2RBHRG-8) was isolated from a cosmid library constructed
from DNA of a 49,XXXXY lymphoblastoid cell line. Partial
sequence analysis confirmed the presence of exons VII-IX.
The sequences of exon VII and of the coding part of
exon IX were found to be identical to the cDNA sequence
reported by Koide et al. (6). We also found the same intron-exon
boundaries for exon VII as proposed by Koide et al. (5).
However, by sequencing the predicted boundary between
exons VIII and IX, we found no intron H in this genomic
clone. The absence of intron H was confirmed by PCR analysis
of genomic DNA using primers in exon VIII and exon IX that
amplify the predicted boundary between these exons. Genomic
DNA was obtained from freshly collected blood from Dutch
volunteers as described previously (9). PCR was performed in a
volume of 50 µl containing 1 µg genomic DNA, 200 ng of a
5'-primer (5'-CAT GCC ACT TTT GGC ACA AAT GGG-3') in exon VIII,
200 ng of a 3'-primer (5'-TTA TTT TGG AAA TGT ATG TGT AAA AAA
CAT GG-3') in exon IX, 200 µM dNTP, 1× polymerase buffer
(Amersham, UK), and 0.5 unit Taq polymerase (Amersham).
Thermocycling conditions were 1 min at 94 °C (denaturation), 1
min at 55 °C (annealing), and 2 min at 72 °C (extension) for 30
cycles. No intron was found in genomic DNA of 40 unrelated
individuals. This finding contrasts with the intron localization
proposed by Koide et al. (5), who placed an intron between
the codons for amino acids 439 and 440 in the gene for HRG
(Fig. 1).
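
As a quick consistency check on the PCR set-up, the two primer sequences given in the text can be summarized by length and GC fraction, and the programmed cycling time can be added up. The Python sketch below does only that; the function name `primer_stats` and the variable names are chosen here for illustration, and ramp times between temperature steps are ignored.

```python
def primer_stats(primer):
    """Return (length in nt, GC fraction) for a primer written 5'->3'."""
    seq = primer.replace(" ", "").upper()
    gc_count = sum(base in "GC" for base in seq)
    return len(seq), gc_count / len(seq)


# Primer sequences copied from the text (spaces mark triplets only).
forward = "CAT GCC ACT TTT GGC ACA AAT GGG"             # exon VIII primer
reverse = "TTA TTT TGG AAA TGT ATG TGT AAA AAA CAT GG"  # exon IX primer

fwd_len, fwd_gc = primer_stats(forward)
rev_len, rev_gc = primer_stats(reverse)

# Programmed time: 30 cycles of 1 min + 1 min + 2 min.
cycles = 30
minutes_per_cycle = 1 + 1 + 2
programmed_minutes = cycles * minutes_per_cycle

print(fwd_len, fwd_gc)      # 24 0.5
print(rev_len, rev_gc)      # 32 0.25
print(programmed_minutes)   # 120
```

The exon IX primer is longer and AT-rich, which is consistent with a primer placed in a low-GC stretch and lengthened to anneal at the same temperature as its partner, although the text does not state the design rationale.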

Cosmid c2RBHRG-8 was used for hybridization to metaphase
chromosome spreads. Labeling, hybridization, washing, and
staining conditions were as described by Wijmenga et al. (10).
Positive hybridization signals were found on the terminal
region of the long arm of chromosome 3 at q28-q29. No
additional signals were seen on chromosome 3 or on other
chromosomes (Fig. 2).

Clone c2RBHRG-8 was also used to identify CA-repeat regions.
A highly polymorphic (GT)6(GA)13(CAGAGA)4 compound repeat
was found in intron G, 93 bp 3' of exon VII (Fig. 1). The repeat
was used to perform linkage analysis on HRG in 40 CEPH
reference families. Amplification was carried out as described
previously (2). Additional marker data were obtained from the
CEPH Genotype Database v6.0. Linkage analysis was performed
using the program CRI-MAP 2.4. Information from 29 markers on
chromosome 3q from the CEPH database was used to insert HRG
into the linkage map between the loci D3S1427 and D3S1294
(with odds for order at least 1000:1). The interlocus distances
between D3S1427, HRG and D3S1294 were 13.5 and 17.8 cM in the
female map and 4.7 and 2.7 cM in the male map, respectively.
The D3S1262 locus, one of the Genethon markers (8), haplotypes
with HRG. The observed localization of HRG by in situ
hybridization and the calculated order on the linkage map are
in good agreement with the cytogenetic localization of marker
D3S1427 on 3q27 (3). The polymorphic marker in the HRG locus,
mapping to the most distal band q28-q29 of chromosome 3,
provides a PCR marker with a PIC of 0.80, which is useful for
filling in a gap in the linkage map near the telomere.
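
Two of the quantities quoted above follow from standard formulas. The polymorphism information content (PIC) is conventionally computed from allele frequencies using the definition of Botstein et al. (1980), and odds for locus order of 1000:1 correspond to a log10 odds (LOD) difference of 3. The Python sketch below illustrates both. The allele frequencies are hypothetical, because the observed frequencies of the HRG repeat alleles are not given in this text, and the function names are chosen here for illustration.

```python
import math
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for a list of allele frequencies.

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    correction = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - correction


def odds_to_lod(odds):
    """Convert an odds ratio (e.g. 1000 for 1000:1) to a LOD score."""
    return math.log10(odds)


# Hypothetical example: eight equally frequent alleles (NOT the HRG data).
example_pic = pic([1 / 8] * 8)
lod = odds_to_lod(1000)

print(round(example_pic, 3))  # 0.861
print(lod)
```

For comparison, a marker with only two equally frequent alleles has a PIC of 0.375, so a PIC of 0.80 implies a marker with many alleles at appreciable frequencies.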

Apart from the homology between the cystatin-like segments
and the homology between the histidine-rich region of
Kininogen and HRG, the evolutionary relationship between
HRG and Kininogen is even more pronounced when the
structures of their genes are compared. The intron localization of
the two cystatin domains of HRG is very similar to that of the
first two cystatin domains of Kininogen. Moreover, as a
consequence of the absence of intron H, the entire region that is
situated C-terminal to the cystatin domains of HRG is
encoded by a single exon. This is comparable to the 3'-exon of the




FIG. 2. In situ hybridization with c2RBHRG-8. Arrows indicate 
positive hybridization signals. 